NHL dataset


 

Dataset summary

Records 11,498
Columns 15
Memory usage 8.80 MB

Attribute           # of categories  Memory usage  Categories
POSITION            12               657.83 KB     C, C RW, C; LW, Centr, D, F, G, L, LW, RW, W, nan
AGE                 17               12.03 KB      17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 37.0
DRAFT_ROUND         10               11.73 KB      1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0
SHOOTS              4                663.87 KB     -, L, R, nan
PPG_CAT             6                686.81 KB     high, low, medium, nan, very high, very low
GPG_CAT             6                692.47 KB     high, low, medium, nan, very high, very low
APG_CAT             6                688.98 KB     high, low, medium, nan, very high, very low
PIMPG_CAT           6                688.86 KB     high, low, medium, nan, very high, very low
PLUS_MINUS_CAT      6                687.87 KB     high, low, medium, nan, very high, very low
POINT_SHARES_CAT    6                687.32 KB     high, low, medium, nan, very high, very low
GAMES_PLAYED_CAT    6                686.97 KB     high, low, medium, nan, very high, very low
NATIONALITY_CAT     10               702.56 KB     Canada, CzechRep., Finland, Germany, Russia, Slovakia, Sweden, Switzerland, USA, other
AMATEUR_LEAGUE_CAT  5                766.39 KB     NOT_DRAFTED, europe, north_america, other, russia
HEIGHT_CAT          5                693.11 KB     175-185, 185-195, <175, GIANT, nan
WEIGHT_CAT          7                685.28 KB     105-115, 115-130, 75-85, 85-95, 95-105, <75, nan

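The summary figures (records, columns, memory usage, and the per-attribute category counts) can be approximated with plain pandas. The sketch below is only an illustration: the helper name `summarize` and the miniature DataFrame are hypothetical, and nothing here is taken from the profiling tool's own source.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column overview: distinct non-missing values, their labels, memory."""
    rows = []
    for col in df.columns:
        values = df[col].dropna().unique()
        rows.append({
            "Attribute": col,
            "# of categories": len(values),
            "Categories": ", ".join(sorted(str(v) for v in values)),
            # deep=True also counts the Python string objects themselves
            "Memory usage (bytes)": df[col].memory_usage(index=False, deep=True),
        })
    return pd.DataFrame(rows)


# Hypothetical miniature stand-in for the NHL data
df = pd.DataFrame({
    "SHOOTS": ["L", "R", "L", None],
    "DRAFT_ROUND": [1.0, 2.0, 2.0, None],
})

overview = summarize(df)
n_records, n_columns = df.shape
total_bytes = int(df.memory_usage(deep=True).sum())
print(n_records, n_columns, total_bytes)
print(overview.to_string(index=False))
```

This sketch counts only non-missing values as categories; a tool that also counts missing values as a category would report one more.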
The profile of the variables
 

POSITION

Categories 12
Most frequent D (3,241 values, 28.19%)
Least frequent L (1 value, 0.01%)
Memory usage 657.83 KB
Missings 0 (0.00%)

POSITION  Count  Frequency
C         2281   19.84%
C RW      2      0.02%
C; LW     2      0.02%
Centr     1      0.01%
D         3241   28.19%
F         13     0.11%
G         981    8.53%
L         1      0.01%
LW        1662   14.45%
RW        1625   14.13%
W         41     0.36%
nan       1648   14.33%

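Each Frequency value in the POSITION table is the category's count divided by the total number of records (11,498). A minimal sketch that recomputes the percentages from the counts reported above; the function name `frequency_table` is hypothetical:

```python
# Counts copied from the POSITION table in this report
position_counts = {
    "C": 2281, "C RW": 2, "C; LW": 2, "Centr": 1, "D": 3241, "F": 13,
    "G": 981, "L": 1, "LW": 1662, "RW": 1625, "W": 41, "nan": 1648,
}


def frequency_table(counts):
    """Return (label, count, percent) rows; percent is count / total * 100."""
    total = sum(counts.values())
    return [(label, n, round(100 * n / total, 2)) for label, n in counts.items()]


rows = frequency_table(position_counts)
total_records = sum(position_counts.values())
most_frequent = max(rows, key=lambda r: r[1])

for label, n, pct in rows:
    print(f"{label:<6} {n:>5} {pct:>6.2f}%")
```

The counts sum to the 11,498 records given in the dataset summary, which is a quick consistency check for any of the per-variable tables.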
AGE

Categorical Ordered
Categories 18
Most frequent 18.0 (4,528 values, 39.38%)
Least frequent 32.0 (1 value, 0.01%)
Memory usage 12.03 KB
Missings 4,773 (41.51%)

AGE   Count  Frequency
17.0  1      0.01%
18.0  4528   39.38%
19.0  1212   10.54%
20.0  725    6.31%
21.0  60     0.52%
22.0  36     0.31%
23.0  31     0.27%
24.0  25     0.22%
25.0  32     0.28%
26.0  21     0.18%
27.0  24     0.21%
28.0  11     0.10%
29.0  6      0.05%
30.0  8      0.07%
31.0  3      0.03%
32.0  1      0.01%
37.0  1      0.01%

DRAFT_ROUND

Categorical Ordered
Categories 11
Most frequent 1.0 (1,322 values, 11.50%)
Least frequent 10.0 (17 values, 0.15%)
Memory usage 11.73 KB
Missings 1,633 (14.20%)

DRAFT_ROUND  Count  Frequency
1.0          1322   11.50%
2.0          1315   11.44%
3.0          1319   11.47%
4.0          1312   11.41%
5.0          1287   11.19%
6.0          1283   11.16%
7.0          1097   9.54%
8.0          658    5.72%
9.0          255    2.22%
10.0         17     0.15%

SHOOTS

Categories 4
Most frequent nan (6,393 values, 55.60%)
Least frequent - (71 values, 0.62%)
Memory usage 663.87 KB
Missings 0 (0.00%)

SHOOTS  Count  Frequency
-       71     0.62%
L       3210   27.92%
R       1824   15.86%
nan     6393   55.60%

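SHOOTS lists 6,393 values labelled nan yet reports 0 missings, as do POSITION and several of the _CAT variables. One plausible explanation, which the report itself does not confirm, is that missing values were converted to the text "nan" before profiling and are therefore counted as an ordinary category. A minimal sketch of that effect, using made-up values and a hypothetical helper `is_missing`:

```python
import math
from collections import Counter

# Made-up sample; float("nan") stands in for a missing value
raw = ["L", "R", float("nan"), "L", float("nan")]


def is_missing(value):
    """True only for real float NaN values, not for the text 'nan'."""
    return isinstance(value, float) and math.isnan(value)


missing_before = sum(is_missing(v) for v in raw)

# Casting everything to text turns NaN into the literal string "nan"
as_text = [str(v) for v in raw]
missing_after = sum(is_missing(v) for v in as_text)
counts = Counter(as_text)

print(missing_before, missing_after, counts)
```

If this assumption holds, the nan rows in these tables should be read as missing data rather than as a genuine category.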
PPG_CAT

Categories 6
Most frequent nan (7,003 values, 60.91%)
Least frequent very high (163 values, 1.42%)
Memory usage 686.81 KB
Missings 0 (0.00%)

PPG_CAT    Count  Frequency
high       303    2.64%
low        1362   11.85%
medium     668    5.81%
nan        7003   60.91%
very high  163    1.42%
very low   1999   17.39%

GPG_CAT

Categories 6
Most frequent nan (7,003 values, 60.91%)
Least frequent very high (5 values, 0.04%)
Memory usage 692.47 KB
Missings 0 (0.00%)

GPG_CAT    Count  Frequency
high       6      0.05%
low        649    5.64%
medium     69     0.60%
nan        7003   60.91%
very high  5      0.04%
very low   3766   32.75%

APG_CAT

Categories 6
Most frequent nan (7,003 values, 60.91%)
Least frequent very high (14 values, 0.12%)
Memory usage 688.98 KB
Missings 0 (0.00%)

APG_CAT    Count  Frequency
high       45     0.39%
low        1235   10.74%
medium     313    2.72%
nan        7003   60.91%
very high  14     0.12%
very low   2888   25.12%

PIMPG_CAT

Categories 6
Most frequent nan (7,003 values, 60.91%)
Least frequent high (505 values, 4.39%)
Memory usage 688.86 KB
Missings 0 (0.00%)

PIMPG_CAT  Count  Frequency
high       505    4.39%
low        927    8.06%
medium     830    7.22%
nan        7003   60.91%
very high  1222   10.63%
very low   1011   8.79%

PLUS_MINUS_CAT

Categories 6
Most frequent nan (7,013 values, 60.99%)
Least frequent high (313 values, 2.72%)
Memory usage 687.87 KB
Missings 0 (0.00%)

PLUS_MINUS_CAT  Count  Frequency
high            313    2.72%
low             979    8.51%
medium          1399   12.17%
nan             7013   60.99%
very high       883    7.68%
very low        911    7.92%

POINT_SHARES_CAT

Categories 6
Most frequent nan (7,003 values, 60.91%)
Least frequent low (844 values, 7.34%)
Memory usage 687.32 KB
Missings 0 (0.00%)

POINT_SHARES_CAT  Count  Frequency
high              897    7.80%
low               844    7.34%
medium            879    7.64%
nan               7003   60.91%
very high         896    7.79%
very low          979    8.51%

GAMES_PLAYED_CAT

Categories 6
Most frequent nan (7,003 values, 60.91%)
Least frequent medium (894 values, 7.78%)
Memory usage 686.97 KB
Missings 0 (0.00%)

GAMES_PLAYED_CAT  Count  Frequency
high              901    7.84%
low               906    7.88%
medium            894    7.78%
nan               7003   60.91%
very high         895    7.78%
very low          899    7.82%

NATIONALITY_CAT

Categories 10
Most frequent Canada (5,847 values, 50.85%)
Least frequent Germany (80 values, 0.70%)
Memory usage 702.56 KB
Missings 0 (0.00%)

NATIONALITY_CAT  Count  Frequency
Canada           5847   50.85%
CzechRep.        494    4.30%
Finland          483    4.20%
Germany          80     0.70%
Russia           709    6.17%
Slovakia         187    1.63%
Sweden           804    6.99%
Switzerland      80     0.70%
USA              2546   22.14%
other            268    2.33%

AMATEUR_LEAGUE_CAT

Categories 5
Most frequent north_america (7,462 values, 64.90%)
Least frequent other (119 values, 1.03%)
Memory usage 766.39 KB
Missings 0 (0.00%)

AMATEUR_LEAGUE_CAT  Count  Frequency
NOT_DRAFTED         1633   14.20%
europe              1690   14.70%
north_america       7462   64.90%
other               119    1.03%
russia              594    5.17%

HEIGHT_CAT

Categories 5
Most frequent nan (6,397 values, 55.64%)
Least frequent <175 (97 values, 0.84%)
Memory usage 693.11 KB
Missings 0 (0.00%)

HEIGHT_CAT  Count  Frequency
175-185     2196   19.10%
185-195     2618   22.77%
<175        97     0.84%
GIANT       190    1.65%
nan         6397   55.64%

WEIGHT_CAT

Categories 7
Most frequent nan (6,397 values, 55.64%)
Least frequent 115-130 (15 values, 0.13%)
Memory usage 685.28 KB
Missings 0 (0.00%)

WEIGHT_CAT  Count  Frequency
105-115     154    1.34%
115-130     15     0.13%
75-85       1064   9.25%
85-95       2626   22.84%
95-105      1219   10.60%
<75         23     0.20%
nan         6397   55.64%

Correlations

Overall correlations


Individual correlations
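
The report does not state which association measure underlies these correlations. For pairs of categorical variables, Cramér's V is one common choice, so the sketch below illustrates that measure purely as an assumption about what is meant. The function name `cramers_v` and the sample labels are hypothetical, and no bias correction is applied.

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Cramér's V between two equally long sequences of category labels."""
    n = len(x)
    joint = Counter(zip(x, y))   # observed cell counts
    row_totals = Counter(x)
    col_totals = Counter(y)
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined when a variable has a single category
    return math.sqrt(chi2 / (n * k))


# Made-up labels: one perfectly associated pair, one independent pair
v_dependent = cramers_v(["L", "L", "R", "R"], ["LW", "LW", "RW", "RW"])
v_independent = cramers_v(["L", "L", "R", "R"], ["LW", "RW", "LW", "RW"])
print(v_dependent, v_independent)
```

Values range from 0 (no association) to 1 (one variable fully determines the other), which makes the measure comparable across variable pairs with different numbers of categories.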

Created by pandas profiling categorical version 0.1.1